Queue system involving SRAM head, SRAM tail and DRAM body

ABSTRACT

A device for queuing information combines the speed of SRAM with the low cost and low power consumption of DRAM, affording substantial expansion of high-speed data storage in queues without corresponding increases in costs. The queues have a variable size, and provide fast, flexible and efficient data storage via an SRAM interface and a DRAM body. The queues may hold pointers to buffer addresses or other data that allow manipulation of information in the buffers via manipulation of the queues. Particular utility for this mechanism exists in situations for which high-speed access to queues is beneficial, flexible queue size is advantageous, and/or the smaller size and lower cost of DRAM compared to SRAM is of value.

MICROFICHE APPENDIX

A Microfiche Appendix comprising one sheet, totaling twenty-seven frames, is included herewith.

COPYRIGHT NOTICE

A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the reproduction of the patent document or the patent disclosure in exactly the form it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.

TECHNICAL FIELD

The present invention relates to memory circuits for microprocessors.

BACKGROUND OF THE INVENTION

The operation of processors frequently involves temporary storage of information for later manipulation. As is well known, data may be stored for random access, or may be stored for access in an ordered fashion such as in a stack or queue. A queue stores data entries in sequential fashion, so that the oldest entry in the queue is retrieved first. The entry and removal of data in queues may be handled by a central processing unit (CPU) processing software instructions.

Such a queue system can be a bottleneck in the efficient operation of the processor. For example, a first item of information obtained from one process may need to be queued to wait for the processing of another item of information, so that both items may then be manipulated together by the processor. The queuing and dequeuing of the first item of information may require additional work of the processor, slowing the eventual processing of both items of information further. More complicated situations involving multiple operands and operations cause the queuing and dequeuing complications to multiply, requiring various locks that absorb further processing power and time. The size and complexity of a microprocessor can lead to correspondingly large and complex arrangements for storing queues.

The allocation of memory space for these queues is also challenging, as the queues can vary in length depending upon the type of operations being processed. For example, a queuing scheme for a communication system is described by Delp et al. in U.S. Pat. No. 5,629,933, in which a number of data packets are stored in first-in, first-out (FIFO) order in queues that are segregated by session identity. Depending upon activity of a particular session, the number of entries in such queues could be very large or zero. In U.S. Pat. No. 5,097,442, Ward et al. teach programming a variable number into a register to store that number of data words in a FIFO memory array, up to the limited size of that array.

To distribute memory for queuing different connections, U.S. Pat. No. 5,812,775 to Van Seters et al. teaches a device for a router having a number of network connections that dedicates specific buffers to each network connection as well as providing a pool of buffers for servicing any network connection. A number of static random access memory (SRAM) queues are maintained for tracking buffer usage and allocating buffers for storage. While SRAM provides relatively quick access compared to dynamic random access memory (DRAM), SRAM memory cells are much larger than DRAM, making SRAM relatively expensive in terms of chip real estate.

SUMMARY OF THE INVENTION

The present invention provides a mechanism for queuing information that is fast, flexible and efficient. The mechanism combines the speed of SRAM with the low cost and low power consumption of DRAM, to enable significant expansion of high-speed data storage in queues without corresponding increases in costs. The queues may be manipulated by hardware or software, and may provide processing events for an event-driven processor. While the queuing mechanism of the present invention can be employed in many systems in place of conventional queues, particular utility is found where high-speed access to queues is beneficial, as well as for situations in which flexible queue size may be an advantage, and/or for cases where the smaller size and lower cost of DRAM compared to SRAM is of value.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram of a plurality of queues of the present invention.

FIG. 2 is a diagram of the enqueuing and dequeuing of entries in a queue of FIG. 1.

FIG. 3 is a diagram of a network computer implementation of the queue system of the present invention.

FIG. 4 is a diagram of a plurality of status registers for the queues of FIG. 3.

FIG. 5 is a diagram of a queue manager that manages movement of queue entries between various queues in the queue system of FIG. 3.

FIG. 6 is a diagram of a queue system that may be provided on a card that plugs into a computer or other device.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

FIG. 1 illustrates a plurality of hardware queues of the present invention, which may contain other such hardware queues as well. A first queue 20 is formed of a combination of SRAM 22 and DRAM 25 storage units. A second queue 27 is similarly formed as a combination of SRAM 30 and DRAM 25 storage units. The queues 20 and 27 each have an SRAM head and tail which can be used as an SRAM FIFO, and the ability to queue information in a DRAM body as well, allowing expansion and individual configuration of each queue. Connection between SRAM FIFOS 22 and 30 and DRAM 25 allows those queues 20 and 27 to handle situations in which the SRAM head and tail are full. DRAM 25 may be formed on the same integrated circuit chip as SRAM FIFOS 22 and 30, or may be separately formed and then connected. The portion of DRAM 25 that is allocated to specific queues such as 20 and 27 may be determined during initialization of the system containing the queues. SRAM FIFOS 22 and 30 afford rapid access to the queues for enqueuing and dequeuing information, while DRAM 25 affords storage for a large number of entries in each queue at minimal cost.

SRAM FIFO 22 has individual SRAM storage units, 33, 35, 37 and 39, each containing eight bytes for a total of thirty-two bytes, although the number and capacity of these units may vary in other embodiments. Similarly, SRAM FIFO 30 has SRAM storage units 42, 44, 46 and 48. SRAM units 33 and 35 form a head 50 of FIFO 22 and units 37 and 39 form a tail 52 of that FIFO, while units 42 and 44 form a head 55 of FIFO 30 and units 46 and 48 form a tail 57 of that FIFO. Information for FIFO 22 may be written into head units 33 or 35, as shown by arrow 60, and read from tail units 37 or 39, as shown by arrow 62. A particular entry, however, may be both written to and read from head units 33 or 35, or may be both written to and read from tail units 37 or 39, minimizing data movement and latency. Similarly, information for FIFO 30 is typically written into head units 42 or 44, as shown by arrow 64, and read from tail units 46 or 48, as shown by arrow 66, but may instead be read from the same head or tail unit to which it was written. While a queue of the present invention may include only one SRAM unit, the availability of plural SRAM units can improve access to SRAM without observable latency from data movement between SRAM and DRAM.

Queue 20 may enqueue an entry in DRAM 25, as shown by arrow 70, by direct memory access (DMA) units acting under direction of a queue manager, not shown in this figure, instead of being queued in the head or tail of FIFO 22. Entries stored in DRAM 25 return to SRAM unit 37, as shown by arrow 73, extending the length and fall-through time of that FIFO. Diversion of information from SRAM to DRAM is typically reserved for when the SRAM is full, since DRAM is slower and DMA movement causes additional latency. Thus queue 20 may comprise the entries stored by the queue manager in both the FIFO 22 and the DRAM 25. Likewise, information bound for FIFO 30 can be moved by DMA into DRAM 25, as shown by arrow 75. The capacity for queuing in cost-effective albeit slower DRAM 25 is user-definable during initialization, allowing the queues to change in size as desired. Information queued in DRAM 25 can be returned to SRAM unit 46, as shown by arrow 77. Movement of information between DRAM and SRAM can be coordinated so that devices utilizing the queue experience SRAM speed although the bulk of queued information may be stored in DRAM.
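
To make the mechanism concrete, the following C sketch models one such queue in software, assuming two-unit SRAM head and tail FIFOs and a fixed-size DRAM body; the type and function names (queue_t, q2d, d2q) and the sizes are illustrative assumptions, not the interfaces of the appendix Verilog.

    #include <stdbool.h>
    #include <stdint.h>

    #define SRAM_SLOTS 2       /* head and tail each hold two storage units (illustrative) */
    #define BODY_SLOTS 1024    /* DRAM body capacity, fixed at initialization (illustrative) */

    /* Software model of one queue: the SRAM head receives new entries, the
     * DRAM body holds overflow, and the SRAM tail holds the oldest entries. */
    typedef struct {
        uint64_t head[SRAM_SLOTS], tail[SRAM_SLOTS], body[BODY_SLOTS];
        unsigned head_n, tail_n;        /* entries currently held in SRAM     */
        unsigned body_wr, body_rd;      /* DRAM body write and read positions */
    } queue_t;

    /* Models the Q2D movement: the oldest head entry moves into the DRAM body. */
    static void q2d(queue_t *q)
    {
        q->body[q->body_wr++ % BODY_SLOTS] = q->head[0];
        for (unsigned i = 1; i < q->head_n; i++)
            q->head[i - 1] = q->head[i];
        q->head_n--;
    }

    /* Models the D2Q movement: the oldest body entry moves into the SRAM tail. */
    static void d2q(queue_t *q)
    {
        q->tail[q->tail_n++] = q->body[q->body_rd++ % BODY_SLOTS];
    }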

The queue system of the present invention may vary in size and may be used with various devices. Such a queue system may be particularly advantageous for devices that benefit from rapid processing of large amounts of data with plural processors. A preferred embodiment described in detail below and in Verilog code in the microfiche appendix includes a queue manager, SRAM and DRAM controllers and a number of queues that may be used with a network communication device.

FIG. 2 depicts the enqueuing and dequeuing of entries in queue 20 for a device 10 such as a processor. When device 10 wants to store data in a queue, information regarding that data is sent to a queue manager 12, which manages entries in multiple queues such as queue 20. Queue manager 12 includes a queue controller 14 and DMA units Q2D 16 and D2Q 18, which may be part of a number of DMA units acting under the direction of queue controller 14. DMA units Q2D 16 and D2Q 18 may be specialized circuitry or dedicated sequencers that transfer data from SRAM to DRAM and vice versa without using the device 10. The queue controller 14 enters the data from device 10 in the head 50 of queue 20, which is composed of SRAM. Should the information be needed again shortly by device 10, the queue controller can read the entry from head 50 and send it back to device 10. Otherwise, in order to provide room for another entry in head 50, DMA unit Q2D 16 moves the entry from the SRAM head 50 to DRAM body 25. Entries are dequeued to device 10 from queue 20 in a similar fashion, with device 10 requesting controller 14 for the next entry from queue 20, and receiving that entry from tail 52 via controller 14. DMA unit D2Q 18, operating as a slave to controller 14, moves entries sequentially from body 25 to SRAM tail 52, so that entries are immediately available for dequeuing to device 10.
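
Continuing the illustrative model sketched above (queue_t, q2d and d2q are the same invented names), the enqueue and dequeue paths might look roughly as follows: entries spill to the DRAM body only when the SRAM head is full, and the tail is refilled from the body so dequeues proceed at SRAM speed.

    /* Enqueue: write into the SRAM head; spill to the DRAM body only when the
     * head is full, since the DMA movement adds latency. */
    void enqueue(queue_t *q, uint64_t entry)
    {
        if (q->head_n == SRAM_SLOTS)
            q2d(q);
        q->head[q->head_n++] = entry;
    }

    /* Dequeue: read the oldest entry, refilling the SRAM tail from the DRAM
     * body when needed; returns false if the queue is empty. */
    bool dequeue(queue_t *q, uint64_t *entry)
    {
        if (q->tail_n == 0 && q->body_rd != q->body_wr)
            d2q(q);
        uint64_t *src = q->tail;
        unsigned *n = &q->tail_n;
        if (*n == 0) {               /* body and tail empty: read the head directly */
            src = q->head;
            n = &q->head_n;
        }
        if (*n == 0)
            return false;
        *entry = src[0];
        for (unsigned i = 1; i < *n; i++)
            src[i - 1] = src[i];
        (*n)--;
        return true;
    }

In this model, enqueuing five entries with a two-slot head leaves the two newest in the head and three in the body, and repeated dequeues return all five in their original order, matching the FIFO behavior described for queue 20.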

FIG. 3 focuses on a queuing system integrated within a network communication device 160 for a host 170 having a memory 202 and a CPU 205. The device 160 is coupled to a network 164 via a media access controller 166 and a conventional physical layer interface unit (PHY), not shown, and coupled to the host 170 via a PCI bus 168. The device 160 may be provided on the host 170 motherboard or as an add-on network interface card for the host. Although a single network connection is shown in this figure for brevity, the device 160 may offer full-duplex communication for several network connections, partly due to the speed and flexibility of the queuing system. Processing of communications received from and transmitted to the network 164 is primarily handled by receive sequencer 212 and transmit sequencer 215, respectively. A queue array 200, which may include thirty-two queues in this embodiment, contains both DRAM 203 and SRAM 206, where the amount of DRAM 203 earmarked for the queue system can vary in size. The DRAM 203 and SRAM 206 are used for other functions besides the queue array 200, and may be formed as part of the device or may be separately formed and attached to the device. The device 160 includes a communications microprocessor 208 that interacts with the CPU 205 and host memory 202 across PCI bus 168 via a bus interface unit 210. A queue manager 220 helps to manage the queue array 200, via DRAM controller 211 and SRAM controller 214.

Status for each of the hardware queues of the queue array 200 is conveniently maintained by and accessed from a set 80 of four registers, as shown in FIG. 4, in which a specific bit in each register corresponds to a specific queue. The registers are labeled Q-Out_Ready 82, Q-In_Ready 84, Q-Empty 86 and Q-Full 88, and for the thirty-two queue embodiment the registers each have thirty-two bits. If a particular bit is set in the Q-Out_Ready register 82, the queue corresponding to that bit contains information that is ready to be read, while the setting of the same bit in the Q-In_Ready register 84 means that the queue is ready to be written. Similarly, a positive setting of a specific bit in the Q-Empty register 86 means that the queue corresponding to that bit is empty, while a positive setting of a particular bit in the Q-Full register 88 means that the queue corresponding to that bit is full. Q-Out_Ready 82 contains bits zero 90 through thirty-one 99 in the thirty-two queue embodiment, including bits twenty-seven 95, twenty-eight 96, twenty-nine 97 and thirty 98. Q-In_Ready 84 contains bits zero 100 through thirty-one 109, including bits twenty-seven 105, twenty-eight 106, twenty-nine 107 and thirty 108. Q-Empty 86 contains bits zero 110 through thirty-one 119, including bits twenty-seven 115, twenty-eight 116, twenty-nine 117 and thirty 118, and Q-Full 88 contains bits zero 120 through thirty-one 129, including bits twenty-seven 125, twenty-eight 126, twenty-nine 127 and thirty 128.
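
A brief C sketch of how software might test and update these per-queue status bits follows; the register layout shown (one thirty-two bit word per register, bit n reflecting queue n) follows the description above, while the structure and helper names are assumptions made for illustration.

    #include <stdbool.h>
    #include <stdint.h>

    /* One 32-bit register per status type; bit n reflects the state of queue n. */
    typedef struct {
        uint32_t q_out_ready;   /* queue n holds an entry ready to be read */
        uint32_t q_in_ready;    /* queue n is ready to be written          */
        uint32_t q_empty;       /* queue n holds no entries                */
        uint32_t q_full;        /* queue n cannot accept another entry     */
    } queue_status_t;

    /* Test whether queue n has an entry ready to be read. */
    static bool queue_out_ready(const queue_status_t *s, unsigned n)
    {
        return (s->q_out_ready >> n) & 1u;
    }

    /* Record that queue n has become empty (or non-empty) after an access. */
    static void queue_set_empty(queue_status_t *s, unsigned n, bool empty)
    {
        if (empty)
            s->q_empty |= 1u << n;
        else
            s->q_empty &= ~(1u << n);
    }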

Operation of the queue manager 220, which manages movement of queue entries between SRAM and the microprocessor, the transmit and receive sequencers, and also between SRAM and DRAM, is shown in more detail in FIG. 5. Requests, which utilize the queues, include Processor Request 222, Transmit Sequencer Request 224, and Receive Sequencer Request 226. Other requests for the queues are DRAM to SRAM Request (D2Q Seq Req) 228 and SRAM to DRAM Request (Q2D Seq Req) 230, which operate on behalf of the queue manager in moving data back and forth between the DRAM and the SRAM head or tail of the queues. Determining which of these various requests will get to use the queue manager in the next cycle is handled by priority logic Arbiter 235. To enable high frequency operation the queue manager is pipelined, with Register-A 238 and Register-B 240 providing temporary storage, while Status Registers Q_Out_Ready 265, Q_In_Ready 270, Q_Empty 275, and Q_Full 280 maintain status until the next update. The queue manager reserves even cycles for SRAM to DRAM, DRAM to SRAM, receive and transmit sequencer requests, and odd cycles for processor requests. Dual-ported QRAM 245 stores variables regarding each of the queues, the variables for each queue including a Head Write Pointer, Head Read Pointer, Tail Write Pointer and Tail Read Pointer corresponding to the queue's SRAM condition, and a Body Write Pointer, a Body Read Pointer and a Queue Size Variable corresponding to the queue's DRAM condition and the queue's size.
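
The per-queue variables kept in QRAM, and the even/odd cycle split enforced by the arbiter, might be modeled roughly as below; the field names, pointer widths, and the priority order among the even-cycle requesters are assumptions made for illustration and are not taken from the appendix Verilog.

    #include <stdbool.h>
    #include <stdint.h>

    /* Per-queue variables held in the dual-ported QRAM (widths are illustrative). */
    typedef struct {
        uint16_t head_wr, head_rd;   /* SRAM head write and read pointers */
        uint16_t tail_wr, tail_rd;   /* SRAM tail write and read pointers */
        uint32_t body_wr, body_rd;   /* DRAM body write and read pointers */
        uint32_t queue_size;         /* size of the queue's DRAM body     */
    } qram_entry_t;

    typedef enum { REQ_NONE, REQ_PROCESSOR, REQ_XMT_SEQ, REQ_RCV_SEQ, REQ_D2Q, REQ_Q2D } request_t;

    /* Even cycles serve the DMA and sequencer requests, odd cycles serve the
     * processor; the ordering among even-cycle requesters is an assumption. */
    static request_t arbitrate(uint32_t cycle, bool proc, bool xmt, bool rcv, bool q2d, bool d2q)
    {
        if (cycle & 1u)
            return proc ? REQ_PROCESSOR : REQ_NONE;
        if (q2d) return REQ_Q2D;
        if (d2q) return REQ_D2Q;
        if (rcv) return REQ_RCV_SEQ;
        if (xmt) return REQ_XMT_SEQ;
        return REQ_NONE;
    }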

After Arbiter 235 has selected the next operation to be performed, the variables of QRAM 245 are fetched and modified according to the selected operation by a QALU 248, and an SRAM Read Request 250 or an SRAM Write Request 255 may be generated. The four queue manager registers Q_Out_Ready 265, Q_In_Ready 270, Q_Empty 275, and Q_Full 280 are updated to reflect the new status of the queue that was accessed. The status is also fed to Arbiter 235 to signal that the operation previously requested has been fulfilled, inhibiting duplication of requests. Also updated are SRAM Addresses 283, Body Write Request 285 and Body Read Requests 288, which are used by DMA controller 214 while moving data between SRAM head and DRAM body as well as SRAM tail and DRAM body. If the requested operation was a write to a queue, data as shown by Q Write Data 264 are selected by multiplexor 266 and pipelined to SRAM Write Data register 260. The SRAM controller services the read and write requests by reading the tail or writing the head of the accessed queue and returning an acknowledge. In this manner the various queues can be utilized and their status updated.

The array of queues 200 contained within the communication device 160 may include thirty-two queues, for example. At the beginning of operation the device memory is divided into a number of large (2 kilobyte) and small (256 byte) buffers, and pointers denoting the addresses of those buffers are created. These pointers are placed in a large free buffer queue and a small free buffer queue, respectively. Over time, as various operations are executed, these free buffer queues offer a list of addresses for buffers that are available to the communication device 160 or other devices. Due to the potential number of free buffer addresses, these free buffer queues commonly include appreciable DRAM 203 in order to provide sufficient room for listing the buffers available to any device in need of a usable buffer. Note that the queue entries need not be pointers but may, for example, comprise thirty-two bits of control information that is used for communicating with or controlling a device. Another example of a variable capacity queue that may contain a significant amount of DRAM 203 is a trace element queue, which can be used to trace various events that have occurred and provide a history of those events, which may for instance be useful for debugging.
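
As a rough sketch of the start-of-day buffer carving described above: the buffer counts, queue numbers, and the dram_alloc and queue_write helpers below are all hypothetical stand-ins, used only to show pointers to freshly carved buffers being placed on the two free buffer queues.

    #include <stdint.h>

    #define LARGE_BUF_BYTES 2048u   /* "large" buffers */
    #define SMALL_BUF_BYTES  256u   /* "small" buffers */

    enum { Q_FREE_LARGE = 30, Q_FREE_SMALL = 31 };   /* illustrative queue numbers */

    /* Hypothetical helpers: carve a buffer out of device DRAM and return its
     * address, and enqueue a 32-bit entry on the numbered queue. */
    extern uint32_t dram_alloc(uint32_t bytes);
    extern void     queue_write(unsigned queue_id, uint32_t entry);

    /* At the beginning of operation, divide device memory into buffers and
     * place a pointer to each buffer on the matching free buffer queue. */
    void init_free_buffer_queues(uint32_t large_count, uint32_t small_count)
    {
        for (uint32_t i = 0; i < large_count; i++)
            queue_write(Q_FREE_LARGE, dram_alloc(LARGE_BUF_BYTES));
        for (uint32_t i = 0; i < small_count; i++)
            queue_write(Q_FREE_SMALL, dram_alloc(SMALL_BUF_BYTES));
    }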

FIG. 6 shows a queue system 300 that may be provided on a card that can plug into a computer or similar device. The queue system contains an array of queues that may include both SRAM 303 and DRAM 305. The queue system may be formed as a single ASIC chip 308, with the exception of DRAM 305. The DRAM 305 may be provided on the card as shown or may exist as part of the computer or other device and be connected to the card by a bus. The system 300 may connect to a microprocessor 310 via a microprocessor bus 313, with a microprocessor bus interface unit 316 translating signals between the microprocessor bus and a queue manager 320. The queue manager 320 controls DMA units 323 and an SRAM controller 325 that can also control the DMA units 323. SRAM controller 325 and DMA units 323 can also interact with a DRAM controller 330, which manages and maintains information in DRAM 305.

While the above-described embodiments illustrate several implementations for the queue system of the present invention, it will be apparent to those of ordinary skill in the art that the present invention may be implemented in a number of other ways encompassed by the scope of the following claims. Examples of such implementations include employment for network routers and switches, controllers of peripheral storage devices such as disk drives, controllers for audio or video devices such as monitors or printers, network appliance controllers and multiprocessor computers.

What is claimed is:
1. An information storage device comprising: a hardware storage queue configured for storage of data in a sequential order, said queue containing an SRAM storage mechanism coupled to a DRAM storage mechanism such that said data is retrieved from said queue in said sequential order, wherein said queue has a head and a tail formed of said SRAM storage mechanism and a body formed of said DRAM storage mechanism.
2. The device of claim 1 wherein said DRAM storage mechanism has a variable storage capacity.
3. The device of claim 1 wherein said SRAM storage mechanism is an interface for said DRAM storage mechanism.
4. The device of claim 1 wherein all of said data being stored in said DRAM storage mechanism was previously stored in said SRAM storage mechanism.
5. The device of claim 1 wherein said data includes a first pointer to a first memory buffer, a second pointer to a second memory buffer and a third pointer to a third memory buffer, said first pointer being stored in said head, said second pointer being stored in said body, and said third pointer being stored in said tail.
6. The device of claim 5 further comprising a queue manager configured for managing movement of said data between said SRAM storage mechanism and said DRAM storage mechanism.
7. The device of claim 6 further comprising a plurality of DMA units controlled by said queue manager for moving said data between said SRAM and DRAM storage mechanisms.
8. The device of claim 7, wherein said queue manager and said DMA units maintain a plurality of information storage queues, each of said information storage queues including a head formed of said SRAM storage mechanism, a body formed of said DRAM storage mechanism, and a tail formed of said SRAM storage mechanism.
9. A system for arranging information for a device, the system comprising: a DRAM array including a number of buffers for storing data, and an information storage queue containing pointers to said buffers, said queue having a body disposed in said DRAM array, said queue having a head disposed in an SRAM memory, said queue having a tail disposed in said SRAM memory.
10. The system of claim 9 wherein said queue is a part of an array of queues having SRAM heads, SRAM tails, and DRAM bodies.
11. The system of claim 10 further comprising a queue manager, said queue manager managing said queues by managing a movement of said pointers into said head, and from said head to said body, and from said body to said tail, and out of said tail.
12. The system of claim 11 further comprising a plurality of registers for maintaining a status of said queues.
13. The system of claim 12 wherein each of said registers has a bit corresponding to one of said queues, said bit indicating a status of said one queue.
14. The system of claim 9 wherein the device includes a plurality of processors, and said queue provides a communication mechanism between said processors.
15. A queue for storing information for a processor, the queue comprising: a head including a first SRAM storage unit, a tail including a second SRAM storage unit, and a body including a DRAM storage unit, wherein a series of entries are stored in a sequential order in said head, body and tail, such that said entries are retrieved by the processor in said sequential order.
16. A network communication device adapted for communication via a network, comprising: static memory comprising a queue head and a queue tail; dynamic memory comprising a queue body; a receive device that receives a network communication from the network; a processor; and a queue manager that maintains a first queue and a second queue, the second queue being a free buffer queue, the first queue comprising the queue head, the queue tail, and the queue body, the queue manager receiving from the receive device a request to perform an operation on the first queue, the queue manager receiving a request from the processor to perform an operation on the first queue.
17. The network communication device of claim 16, wherein the receive device is a receive sequencer.
18. The network communication device of claim 17, wherein the free buffer queue includes a plurality of entries, each entry of the free buffer queue pointing to a buffer in a memory.
19. The network communication device of claim 18, wherein the queue manager receives a request to write a head entry to the first queue, and wherein in response the queue manager returns an acknowledge signal.
20. The network communication device of claim 19, wherein the queue manager includes an arbiter, the arbiter determining which of a plurality of requests received by the queue manager will be executed by the queue manager next.
21. The network communication device of claim 20, wherein there are first cycles and second cycles, wherein on the first cycles the queue manager does not handle requests from the receive device, and wherein on the second cycles the queue manager does not handle requests from the processor.
22. The network communication device of claim 16, further comprising a plurality of memory buffers, the first queue storing a queue entry, the queue entry including a pointer to one of the plurality of memory buffers.
23. The network communication device of claim 22, wherein the queue entry includes control information for controlling a device.
24. The network communication device of claim 16, wherein the receive device, the processor, the static memory and the queue manager are integrated on the same integrated circuit chip.
25. The network communication device of claim 16, wherein the queue manager is pipelined.
26. The network communication device of claim 16, wherein the queue manager comprises means for storing an indication of whether the queue is empty.
27. A network communication device adapted for communication via a network, comprising: static memory comprising a queue head and a queue tail; dynamic memory comprising a queue body; a receive sequencer that receives a network communication from the network; a processor; and means for maintaining a queue, the queue comprising the queue head, the queue tail, and the queue body, the means also being for receiving from the receive sequencer a request to perform an operation on the queue, the means receiving a request from the processor to perform an operation on the queue.
28. The network communication device of claim 27, wherein the network communication device is coupled to a host computer, the means including a queue manager and a plurality of DMA units coupled to the queue manager, the queue manager also including means for storing a status of the queue.
29. The network communication device of claim 27, wherein the network communication device includes a plurality of memory buffers, and wherein a plurality of queue entries is stored in the queue, each of the queue entries including a pointer to one of the memory buffers, the means also maintaining a second queue, the second queue being a free buffer queue.